01. Assessment

What is the difference between supervised and unsupervised machine learning?

SOLUTION: Supervised learning requires labeled training data, and we know the outcome we are trying to predict. In unsupervised learning, the data points are not labeled and we do not necessarily know the outcome we are trying to predict.
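
The contrast can be illustrated with a minimal sketch in plain Python. The data, the function names, and the two toy methods (a 1-nearest-neighbor classifier and a split at the widest gap) are assumptions chosen for illustration, not part of the assessment. The point is only that the supervised function needs labels and the unsupervised one does not.

```python
def predict_1nn(train_x, train_y, query):
    """Supervised: the labels (train_y) are required to make a prediction."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return train_y[nearest]


def split_at_largest_gap(values):
    """Unsupervised: no labels; group 1-D values by cutting at the widest gap."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


# Hypothetical toy data: heights in cm, with labels used only by the supervised case.
heights_cm = [150, 152, 155, 180, 183, 185]
labels = ["short", "short", "short", "tall", "tall", "tall"]

print(predict_1nn(heights_cm, labels, 181))   # uses features and labels
print(split_at_largest_gap(heights_cm))       # uses features only
```

In the supervised call we already know what we are predicting (a "short" or "tall" label). In the unsupervised call the algorithm only returns groups, and it is up to us to interpret them.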

What is the ‘k’ in k-means clustering and how do we determine its value?

SOLUTION: The ‘k’ refers to the number of clusters, and we can determine its value using techniques such as the Elbow Method or the Silhouette Method.
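
As a rough sketch of the Elbow Method, the code below runs a small k-means on made-up 2-D points for several values of k and records the inertia (the sum of squared distances from each point to its nearest centroid). Everything here is an assumption for illustration: the data, the helper names, and the deterministic farthest-point initialization, which stands in for the random or k-means++ initialization that library implementations typically use.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    # Deterministic stand-in for random or k-means++ initialization:
    # start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans_inertia(points, k, n_iter=20):
    """Run Lloyd's algorithm and return the final inertia for k clusters."""
    centroids = farthest_point_init(points, k)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Hypothetical data: three well-separated groups of three points each.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 2))
```

On this toy data the inertia should drop sharply up to k = 3 and change little after that. That bend is the "elbow", and it suggests k = 3. The Silhouette Method follows the same loop over candidate values of k but scores each clustering by how well separated the clusters are instead of by inertia.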

When evaluating how many components to include in PCA, what is a good rule of thumb for the total amount of variance to be captured by the kept components?

SOLUTION: Roughly 80% of the total variance.
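
To show how this rule of thumb might be applied, here is a minimal sketch that picks the smallest number of components whose cumulative explained variance reaches a target. The function name and the ratios are hypothetical. In practice the ratios would come from a fitted PCA, for example the `explained_variance_ratio_` attribute of scikit-learn's `PCA`.

```python
from itertools import accumulate


def components_for_variance(explained_variance_ratio, target=0.80):
    """Return the smallest number of leading components whose cumulative
    explained variance ratio is at least `target`."""
    for n, total in enumerate(accumulate(explained_variance_ratio), start=1):
        if total >= target:
            return n
    # Target not reached (e.g. ratios do not sum to 1): keep every component.
    return len(explained_variance_ratio)


# Hypothetical explained variance ratios, sorted from largest to smallest.
ratios = [0.45, 0.25, 0.15, 0.10, 0.05]

print(components_for_variance(ratios))        # default 80% target
print(components_for_variance(ratios, 0.90))  # stricter 90% target
```

With these made-up ratios, the first two components capture about 70% of the variance, so a third is needed to pass the 80% mark.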